Virus Evolution
◐ Oxford University Press (OUP)
Preprints posted in the last 7 days, ranked by how well they match Virus Evolution's content profile, based on 140 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.
Colliot, L.; Garrot, V.; Petit, P.; Zhukova, A.; Chaix, M.-L.; Mayer, L.; Alizon, S.
Understanding the dynamics of HIV epidemics is important to control them effectively. Classical methods that mainly rely on occurrence data are limited by the fact that an unknown part of the epidemic eludes sampling. Since the early 2000s, phylodynamic methods have enabled the estimation of key epidemiological parameters from virus genetic sequence data. These methods have the advantage of being less sensitive to partial sampling and of providing insights about epidemic history that predates even the first samples. In this study, we analysed 2,205 HIV sequences from the French ANRS PRIMO C06 cohort. We identified and were able to reconstruct the temporal dynamics of two large clades that represent the HIV-1 epidemics in the country. Using Bayesian phylodynamic inference models, we found that the first clade, from subtype B, originated at the end of the 1970s, grew rapidly during the 80s before decreasing from 2000 to 2015 and stagnating since then. The second clade, from circulating recombinant form CRF02_AG, emerged and spread in the 80s, grew again in the early 2000s, before declining slightly. We also estimated key epidemiological parameters associated with each clade. Finally, using numerical simulations, we investigated prospective scenarios and assessed the possibility of meeting the 2030 UNAIDS targets. This is one of the rare studies to analyse the HIV epidemic in France using molecular epidemiology methods. It highlights the value of routine HIV sequence data for studying past epidemic trends or designing public health policies.
Ngo, A.; Guindon, S.; Pedergnana, V.
Understanding how genetic variation in pathogens influences clinical phenotypes observed in infected hosts is a fundamental challenge in evolutionary genomics and public health. Phenotypic traits such as infection severity are often non-randomly distributed within the pathogen's phylogeny, suggesting the existence of evolutionary determinants but also violating the independence assumption underlying classical genome-wide association studies and potentially leading to inflated false positive rates. We present MutaPhy, a phylogeny-based method aimed at detecting correlations between a binary host phenotype and the corresponding pathogen genome by directly utilizing the hierarchical structure of phylogenetic trees. MutaPhy encompasses three different scales: (i) a subtree scale, on which relevant clades over-representing the phenotype of interest are detected using permutation-based tests; (ii) a tree scale, which agglomerates local signals into a global association statistic; and (iii) a site scale, whereby candidate mutational events on branches leading to significant clades are examined using ancestral sequence reconstruction. We evaluate the statistical behavior and detection performance of MutaPhy using simulations under diverse evolutionary scenarios. We also compare this tool to several existing phylogenetic association methods. As illustrative applications, we apply MutaPhy to dengue virus and hepatitis C virus datasets associated with clinical phenotypes in human hosts. Our results highlight the ability of the proposed approach to detect viral lineages associated with over-represented phenotypes while revealing limited evidence for robust mutation-level associations in these particular datasets. Altogether, MutaPhy provides a framework for guiding genotype-phenotype association analyses by leveraging phylogenetic structure, thereby reducing false positive findings and improving the interpretability of association signals.
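The subtree-scale step in the abstract above rests on a permutation test for phenotype over-representation in a clade. The sketch below illustrates that general idea only; it is not MutaPhy's implementation, which the abstract does not specify. The function name, the input format (a flat list of 0/1 tip labels plus the indices of the tips in a clade), and the plain label shuffle are all assumptions made for illustration.

```python
import random


def clade_permutation_pvalue(phenotypes, clade_idx, n_perm=2000, seed=0):
    """One-sided permutation p-value for over-representation of phenotype 1.

    phenotypes: list of 0/1 labels, one per tip (hypothetical input format).
    clade_idx:  indices of the tips that belong to the clade under test.
    """
    rng = random.Random(seed)
    observed = sum(phenotypes[i] for i in clade_idx)
    labels = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        # Shuffle labels across all tips, keeping clade membership fixed.
        rng.shuffle(labels)
        if sum(labels[i] for i in clade_idx) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: 12 tips, the first four carry the phenotype of interest.
tips = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p_enriched = clade_permutation_pvalue(tips, clade_idx=[0, 1, 2, 3])
p_mixed = clade_permutation_pvalue(tips, clade_idx=[0, 4, 5, 6])
```

In this toy setting the clade made up entirely of phenotype-positive tips receives a much smaller p-value than the mixed clade. A real analysis would also need to correct for testing many nested clades, which this sketch does not attempt.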
Crespo-Bellido, A.; Trovao, N. S.; Puryear, W.; Maksiaev, A.; Pekar, J. E.; Baele, G.; Dellicour, S.; Nelson, M. I.
Since 2021, highly pathogenic avian influenza viruses (HPAIVs) belonging to H5N1 clade 2.3.4.4b have circulated widely in North American wild birds and repeatedly spilled over into mammals. In 2025, the first H5N1-associated deaths in humans were recorded in the Western hemisphere, raising questions about how the ongoing evolution of the virus in wild birds impacts spillover risk. Here, our analysis of 21,471 H5N1 genomes identified an evolutionary shift in mid-2024, driven by interhemispheric migration from Asia and reassortment with new antigens. The genotypes that dominated the early years of North America's H5N1 epizootic traced their ancestry back to Europe, but Asia was the source of new "D1.1" genotype viruses that (a) spread faster, (b) have higher reassortment potential, (c) have a broader host range, (d) repeatedly spill over to bovines, and (e) cause severe disease in humans, including non-farm workers.
Ochola, G.; Pulkkinen, E.; Ogola, J. G.; Makela, H.; Masika, M.; Vauhkonen, H.; Smura, T.; Jaaskelainen, A. J.; Anzala, O.; Vapalahti, O.; Mweu, A. W.; Forbes, K. M.; Lindahl, J. F.; Laakkonen, J.; Uusitalo, J.; Altan, E.; Korhonen, E. M.; Sironen, T.
The majority of emerging infectious diseases are zoonotic, having their origin in wildlife before spilling over into the human population. While small mammals are recognized as critical reservoirs for these viruses, their viral diversity remains largely uncharacterized across many African countries. We conducted molecular surveillance of synanthropic rodents and shrews in the Kibera informal settlement in Nairobi and the rural Taita Hills region of Kenya to detect and characterize potential zoonotic viruses. Tissue samples from 228 rodents and shrews were screened for six viral families using PCR assays. Rat hepatitis E virus (HEV) (Rocahepevirus ratti), a rodent-associated virus with potential for human spillover, was identified in Mus musculus and Rattus norvegicus from Kibera. NGS was conducted for the HEV positive samples, and we obtained two near-complete HEV genomes from Rattus norvegicus, which clustered within rodent-associated HEV genotypes in the phylogenetic analysis. The two sequences from the Rattus norvegicus cluster together, indicating a close genetic relationship. Paramyxoviruses belonging to the genera Jeilongvirus and Parahenipavirus were detected both from Taita and Kibera in nine different samples from Rattus norvegicus, Mus minutoides, Crocidura sp and Acomys ignitus. One paramyxovirus positive sample (Acomys ignitus) from Taita was selected for further sequencing with NGS, and a complete genome of a new jeilongvirus was assembled. Phylogenetic analysis of the detected viruses confirmed the close relation to previously known rodent-borne jeilongviruses but also revealed potentially novel jeilong- and parahenipavirus species. Our findings highlight the circulation of potentially zoonotic viruses in both urban and rural small mammals in Kenya. It emphasizes the necessity of continued genomic surveillance of zoonotic viruses to mitigate risks of their spillover into human populations. 
Highlights: Surveillance reveals diverse rodent-borne viruses circulating in Kenya. Rat-HEV was detected in Rattus norvegicus and Mus musculus from an urban low-income area. Paramyxoviruses were detected across multiple rodent and shrew species, including novel Acomys ignitus jeilongvirus.
Graphical abstract: Figure 1.
Billet, L. S.; Skelly, D. K.; Sauer, E. L.
Pathogens that persist subclinically across many wildlife populations can drive mass mortality in others. Mass mortality is often abrupt, and the timing can be difficult to predict from host or habitat features alone. In a recent field study tracking ranavirus epizootics in wood frog (Rana sylvatica) breeding ponds, we found that no environmental or biotic feature reliably predicted die-off occurrence or timing. Instead, the trajectory of viral accumulation in the water column was the strongest dynamic predictor of mass mortality. Infected hosts shed virus throughout epizootics, but the influence of waterborne viral concentration on disease progression was apparent only near die-off onset. This pattern suggests a potential threshold-dependent feedback operating through the shared viral environment. Here, we develop a compartmental model linking waterborne viral concentration to the rate at which subclinical infections progress to clinical, high-shedding states within already-infected hosts. We show that a dose-dependent progression model generates the two-phase epizootic trajectory observed in natural die-offs: prolonged subclinical circulation followed by abrupt clinical transition after environmental virus crosses an escalation threshold. The model exhibits a sharp phase transition between subclinical circulation and mass mortality, governed mainly by the clinical-to-subclinical shedding ratio, host density, and pond volume. Existing explanations for die-off variation emphasize individual-level susceptibility, but our model demonstrates that dose-dependent environmental feedback, a mechanism not previously formalized at the population level, can generate the transition from subclinical infection to mass mortality without invoking individual variation in host susceptibility. This mechanism may apply in any system where hosts share a bounded environment, pathogen dose influences disease severity, and pathogen shedding increases with disease progression.
Lara, A. Z.; Hardy, R. W.; Phelps, M.; Newton, I.
The ability of the bacterial endosymbiont Wolbachia pipientis to block arboviruses in its mosquito host may be impaired by host genetic variation, leading to reduced efficacy in field releases. Across a large collection of Drosophila lines carrying natural genetic variation, we found that viral replication varied greatly in the absence of Wolbachia. However, the introduction of the symbiont reduced viral load in each background to similar levels, near the limit of detection. Therefore, Wolbachia-mediated viral blocking is seemingly robust against host genetic background. A genome-wide association study harnessing the variation in the viral loads across the Wolbachia-free set identified rhoGAP18B and betaCOP as host factors that contribute to SINV replication; their gene products also appear to interact with each other in the context of cytoskeletal dynamics. These results shed light on host requirements for SINV replication and suggest possible avenues by which Wolbachia may encroach upon them during blocking.
Revell, L. J.; Alencar, L. R. V.; Alfaro, M. E.; Dain, J.; Hill, N. J.; Jones, M.; Martinet, K. M.; Romero-Alarcon, V.; Harmon, L. J.
The practical utility of many modern phylogenetic comparative methods can depend on how accurately mathematical models capture the evolutionary process of traits. Boucher and Demery (2016) described a new quantitative trait model, Brownian motion with reflective limits, that they anticipated might be of use in testing hypotheses about a particular sort of constraint on phenotypic character evolution. Since their analytic solution for the probability function under this bounded evolutionary scenario was not practical to evaluate for reasonably-sized trees, Boucher and Demery (2016) also identified a creative technique for computing the likelihood of their model. The basis of this methodology derives from the convergence of an equal-rates, symmetric, ordered Markov chain and continuous stochastic diffusion in the limit as the number of steps in our chain goes to infinity (or, alternatively, as their widths decrease towards zero). We refer to this convergence in the limit as the discretized diffusion approximation or (more compactly) the discrete approximation. We realized that this discrete approximation of Boucher and Demery (2016) unlocked a number of additional models for the phylogenetic comparative analysis of discrete and continuous trait data, and we explore several of these in the present article. Specifically, we examine application of this discretized diffusion approximation to the threshold model from evolutionary quantitative genetics, to a new "semi-threshold" trait evolution model, to a joint model of discrete and continuous traits in which the discrete trait influences the rate of evolution of our continuous character, as well as a model where precisely the converse is true, and to a discrete character dependent multi-trend trended continuous trait evolution model. We conclude with some context for the origins of our article and discussion of other possible applications of this powerful approach.
Zhan, Q.; Pascual, M.; He, Q.
Major surface antigens in many pathogens are encoded by rapidly diversifying multigene families, generating fitness variation through antigenic and functional differences. These variations align with the niche and absolute fitness axes of Modern Coexistence Theory (MCT). Yet, how such gene families evolve along these axes under competition for hosts and across transmission gradients remains poorly understood, as prior MCT studies have not explicitly accounted for evolutionary dynamics in high dimensions. We use a stochastic computational model of Plasmodium falciparum transmission to examine how transmission intensity and selection shape var multigene family evolution and composition within parasite genomes. Results show that selection alone cannot maintain the observed stable ratio of two gene groups within parasite genomes, indicating that group-based classifications do not clearly reflect transmission strategy or virulence. When a trade-off exists between diversification rates and absolute fitness, strong immune selection under high transmission favors fast-recombining genes while attenuating functional selection on R0-associated traits. In general, stronger immune selection increases the invasion probability of novel antigens and the niche differentiation among parasite genomes, while reducing the variance in gene-level transmissibility and expression duration, and therefore R0. This outcome, combining enhanced niche differentiation and reduced absolute fitness variation, departs from MCT predictions.
Berisha, E.; Sanchez, E. L.
Kaposi's Sarcoma Herpesvirus (KSHV), an enveloped double-stranded DNA virus, is the etiological agent of Kaposi's sarcoma (KS), an endothelial cell-based tumor. KSHV is a leading cause of infection-related cancers in sub-Saharan Africa and immunocompromised individuals worldwide. Therefore, it is vital to identify the underlying mechanisms of viral infection and transmission to effectively identify specific therapeutic strategies and combat the disease. Here, we demonstrate that KSHV rewires the host cell lipidome during lytic infection. Bulk lipidomic analysis shows significant changes in the abundance of neutral lipids and phospholipids during lytic infection. We further investigated fatty acid-binding proteins (FABPs) to understand the underlying mechanisms that support KSHV pathogenesis. Using the doxycycline-inducible iSLK.BAC16 cell line, we find that FABP genes are differentially regulated by lytic KSHV infection compared to latent infection. We report that FABP4 is significantly upregulated during lytic infection. Loss of FABP4 during lytic infection does not impact viral gene transcription; however, lytic protein translation is reduced. Moreover, our intracellular and extracellular viral titers indicate that FABP4 affects maximal infectious virion production. This study highlights the role of FABP4 and its therapeutic potential as a target that facilitates KSHV infection and pathogenesis.
Fronik, S.; Wolff, G.; Limpens, R. W. A. L.; de Jong, A. W. M.; Zheng, S.; Agard, D. A.; Koster, A. J.; Snijder, E. J.; Barcena, M.
Upon infection, arteriviruses, coronaviruses, and other nidoviruses transform endoplasmic reticulum membranes into viral replication organelles. These include large numbers of double-membrane vesicles (DMVs) whose interior is considered the primary site of viral RNA synthesis. Early studies characterized nidovirus DMVs as sealed compartments, leaving it unclear how newly synthesized viral RNA could be exported to the cytosol. The discovery of DMV-spanning pore complexes in coronavirus-infected cells provided a plausible solution for this topological challenge. However, their structural organization, functional features, and evolutionary conservation across the nidovirus order, have remained unclear. Here, we investigated the macromolecular architecture of DMVs induced by two prototypic arteriviruses using cellular cryo-electron tomography. Despite the substantial evolutionary distance separating arteriviruses and coronaviruses, we observed DMV-spanning pore complexes with striking structural similarities to those previously described in coronaviruses. These pores appear to facilitate both export and encapsidation of viral RNA. In the absence of viral RNA synthesis, ectopic expression of the arterivirus transmembrane nonstructural proteins nsp2 and nsp3 sufficed to induce the formation of pore-containing DMVs. Together, our findings reveal the conservation of key structural features of DMV pores across two distantly related nidovirus families and support a central role for these pores in nidovirus replication.
Hawkey, J.; Nodari, C. S.; Iqbal, Z.; Hunt, M.; Wick, R. R.; Chong, C. E.; Jenkins, C.; Howden, B. P.; Holt, K.; Weill, F.-X.; Baker, K. S.; Ingle, D. J.
Shigella flexneri is the leading causative agent of shigellosis globally. The public health threat posed by S. flexneri is compounded by its emergence as a sexually transmissible infection, the importance of international travel in driving dissemination, and the increasing prevalence of antimicrobial resistance (AMR). A rapid and robust computational method is needed to enhance genomic surveillance and systematically explore features of the population structure of this WHO priority pathogen, which is scalable and readily implementable across jurisdictions, particularly as vaccine development efforts are underway. Here, we present Flex-It, a genomic framework and genotyping scheme implemented in Mykrobe for S. flexneri serotypes 1-5, X & Y, compatible with previous approaches used to describe S. flexneri's population structure. To develop Flex-It, we curated a retrospective dataset of 5,819 publicly available S. flexneri genomes. We characterised the global population structure for S. flexneri, exploring geographical and temporal traits, and showed the granular diversity of AMR and serotype profiles. We applied Flex-It to >13,000 genomes routinely generated by public health laboratories from Australia, the UK and the USA across a ten-year period. We found significant genotype diversity in all three locations, with the emergence of genotypes with converged resistance to all major drugs currently used for treatment. Flex-It provides an open-source, novel genotyping method that rapidly characterises S. flexneri and its ciprofloxacin resistance determinants in <1 minute from both short and long whole-genome sequencing reads. Flex-It provides the community with a standardised nomenclature to monitor the emergence and spread of S. flexneri lineages.
Qian, K.; Abhyankar, V.; Keo, D.; Zarceno, P.; Toy, T.; Eskin, E.; Arboleda, V. A.
Sequencing the respiratory tract transcriptome has the potential to provide insights into infectious pathogens and the host's immune response. While DNA-based sequencing is more standard in clinical laboratories due to its stability, RNA assays offer unique advantages. RNA reflects dynamic physiological changes, and for RNA viruses, viral RNA particles directly represent copies of the viral genome, enabling greater diagnostic sensitivity. However, RNA's susceptibility to degradation remains a significant challenge, particularly in RNase-rich specimens like saliva. To address this, we conducted a systematic, combinatorial evaluation of 24 distinct mNGS workflows, crossing eight nucleic acid extraction methods with three RNA-Seq library preparation protocols. Remnant saliva samples (n = 6) were pooled and spiked with MS2 phage as a control. The SARS-CoV-2 virus was spiked into half of the samples, which were extracted using the eight different extraction methods (n = 3) and compared using RNA Integrity Number equivalent (RINe) scores and RNA concentration. The extracted RNA was then processed across the three library construction methods and subjected to short-read sequencing to assess all 24 combinations head-to-head. We compared methods based on viral read recovery and found that RINe and concentration did not correlate with viral detection. The Zymo Quick-RNA Magbead kit and the Tecan Revelo RNA-Seq High-Sensitivity RNA library kit were the extraction and library-preparation kits that yielded the most SARS-CoV-2 reads, respectively. Importantly, our combinatorial analysis revealed that any small variability attributable to different nucleic acid extraction methods was heavily overshadowed by differences in quality attributable to the RNA-Seq library preparation methods. These findings challenge the reliance on conventional RNA quality metrics for clinical metagenomics and underscore the need to redefine extraction quality standards for mNGS applications.
Importance: mNGS is a powerful and unbiased approach towards pathogen detection that has mostly been applied to blood and cerebrospinal fluid samples. However, mNGS has recently been applied to more areas including the respiratory pathogen detection space, with potential applications in both in-patient diagnostics and public health surveillance. Saliva samples are an ideal sample type for these use cases since they can be collected non-invasively. However, saliva is also a challenging sample type due to its high RNase activity and often yields low-quality nucleic acid. This study explores the feasibility of using saliva specimens in mNGS with contrived SARS-CoV-2 samples to optimize the combination of two factors: nucleic acid extraction and RNA-seq library preparation. Exploration in this area could enhance the sensitivity of saliva-based mNGS assays, with the goal of future expansion of this specimen type in clinical diagnostics and public health surveillance.
Key Points: The choice of RNA-Seq library preparation kit has a greater impact on pathogen detection than the nucleic acid extraction method. The combination of Zymo Quick-RNA Magbead extraction kit and TECAN Revelo RNA-Seq High Sensitivity RNA library kit recovered the highest percentage of total SARS-CoV-2 reads. RNA quantity and RINe score do not correlate with viral read capture, indicating a need for an alternative metric to assess RNA quality for downstream mNGS clinical diagnostics.
Hassanzadeh, R.; Abdollahi, N.; Kossida, S.; Giudicelli, V.; Eslahchi, C.
High-throughput B-cell receptor sequencing has transformed the analysis of adaptive immunity, but benchmarking clonal grouping and lineage reconstruction methods remains limited by the absence of datasets with known evolutionary histories. Here we present Ancestra, a lineage-explicit simulator of B-cell receptor heavy-chain affinity maturation. Ancestra models stochastic V(D)J recombination, context-dependent somatic hypermutation, affinity-based selection and clonal expansion while recording complete parent-child relationships and mutation events. The framework generates BCR heavy-chain sequence datasets together with their corresponding ground-truth lineage trees, enabling direct benchmarking of lineage-aware analytical methods. Across simulations, Ancestra recapitulates key properties of human repertoires, including complementarity-determining region 3 length distributions, amino-acid usage patterns, junctional mutation patterns consistent with IMGT criteria and heterogeneous branching topologies. Simulated lineages also reveal multi-label lineage trees, in which identical nucleotide sequences can arise independently along distinct evolutionary paths. Ancestra provides a practical foundation for rigorous benchmarking of lineage-aware immune repertoire analysis.
Bahig, S.; Oughton, M.; Vandesompele, J.; Brukner, I.
In dense urban settings, delays between diagnostic sampling and effective isolation can sustain transmission during peak infectiousness. We define a waiting-window transmission externality arising when infectious individuals remain mobile while awaiting results, formalized as E = N·P·TR·D, where N is daily testing volume, P test positivity, TR transmission during the waiting period, and D turnaround time. Using Monte Carlo simulation and a susceptible-infectious-recovered (SIR) framework, we quantify excess infections per 1,000 tests/day under multiple diagnostic workflows. A surge scenario incorporates positive coupling between TR and D (ρ = 0.45), reflecting co-occurrence of laboratory saturation and elevated contacts during system stress. Under centralized 48-hour workflows, excess infections reach ~80 at P = 10% and ~401 at P = 50%, increasing to ~628 under surge conditions. In contrast, near-patient rapid testing and home sampling reduce this to ~5 and ~25-26, respectively. Workflows that eliminate the waiting window, either through immediate isolation at sampling or through home-based PCR that returns results at the point of collection, effectively collapse the transmission term. These findings identify diagnostic delay as a modifiable driver of epidemic dynamics. Operational redesign of testing workflows, including decentralized sampling and home-based molecular diagnostics, offers a scalable pathway to improve epidemic controllability and reduce inequities in dense urban environments.
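The externality defined in the abstract above is a plain product of four terms, so a deterministic version is easy to sketch. The code below is only that product; it does not reproduce the authors' Monte Carlo or SIR analysis. The units (TR as secondary infections per positive case per day of waiting, D in days) and the value `TR_ASSUMED = 0.4` are assumptions chosen for illustration and are not stated in the abstract.

```python
def waiting_window_externality(n_tests, positivity, tr_per_day, delay_days):
    """Deterministic E = N * P * TR * D for a single parameter set."""
    return n_tests * positivity * tr_per_day * delay_days


# Hypothetical transmission term: NOT a value reported in the abstract.
TR_ASSUMED = 0.4

# 1,000 tests/day with a 48-hour (2-day) turnaround, at two positivity levels.
e_low = waiting_window_externality(1000, 0.10, TR_ASSUMED, 2.0)
e_high = waiting_window_externality(1000, 0.50, TR_ASSUMED, 2.0)

# Shortening the delay shrinks E proportionally: here a 3-hour turnaround.
e_rapid = waiting_window_externality(1000, 0.10, TR_ASSUMED, 3.0 / 24.0)
```

Because E is linear in every factor, halving the turnaround time halves the expected excess infections for fixed N, P and TR. The surge scenario in the abstract departs from this simple picture by letting TR and D vary together.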
Le Nagard, L.; Schwarz-Linek, J.; Krasnopeeva, E.; Douarche, C.; Arlt, J.; Dawson, A.; Martinez, V.; Poon, W. C. K.; Pilizota, T.
We study an unexpectedly fast decay of motility in dense suspensions of Escherichia coli bacteria supplied with excess glucose under anaerobic conditions. The decrease in swimming speed occurs on a timescale inversely proportional to the cell concentration, and is associated with the secretion of organic acids by the bacteria. We show that the decay is driven by the progressive accumulation of non-ionised organic acids in the medium, and develop a chemical kinetic model that successfully predicts the swimming speed variations over a range of conditions in the presence of these acids. We further measure the internal pH of E. coli cells exposed to organic acids, and find that the speed decay coincides with sharp declines in internal pH and metabolic rate. Our findings identify an additional layer of motility control that can arise in complex environments even when motility genes are expressed and energy sources are abundant. This mechanism is likely relevant for understanding bacterial motility in habitats such as the human gut, where high densities of bacteria and organic acids are common.
Musonda, R.; Ito, K.; Omori, R.; Ito, K.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has continuously evolved since its emergence in the human population in 2019. As of 1st August 2025, more than 1,700 Omicron subvariants have been designated by the Pango nomenclature system. The Pango nomenclature system designates a new lineage based on genetic and epidemiological information of SARS-CoV-2 strains. However, there is a possibility that strains that have similar genetic backgrounds and the same phenotype are given different Pango lineage names. In this paper, we propose a new algorithm, called FindPart-w, which can identify groups of viral lineages that share the same relative effective reproduction numbers. We introduced a new lineage replacement model, called the constrained RelRe model, which constrains groups of lineages to have the same relative effective reproduction numbers. The FindPart-w algorithm searches the equality constraints that minimise the Akaike Information Criterion of constrained RelRe models. Using hypothetical observation count data created by simulation, we found that the FindPart-w algorithm can identify groups of lineages having the same relative effective reproduction number in a practical computational time. Applying FindPart-w to actual real-world data of time-stamped lineage counts from the United States, we found that the Pango lineage nomenclature system may have given different lineage names to SARS-CoV-2 strains even if they have the same relative effective reproduction number and similar genetic backgrounds. In conclusion, this study showed that viruses that had the same relative effective reproduction number were identifiable from temporal count data of viral sequences. These findings will contribute to the future development of lineage designation systems that consider both genetic backgrounds and transmissibilities of lineages.
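The model selection step described above relies on the Akaike Information Criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and L the maximised likelihood. The sketch below shows only that comparison. It is not the FindPart-w search, and the log-likelihoods and parameter counts are invented for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: these numbers are made up and not taken from the study.
candidates = {
    # Every lineage has its own relative effective reproduction number.
    "unconstrained": aic(log_likelihood=-1000.0, n_params=4),
    # Two lineages are forced to share one value, saving one parameter.
    "constrained": aic(log_likelihood=-1000.4, n_params=3),
}

# The preferred model is the one with the smallest AIC.
best = min(candidates, key=candidates.get)
```

Here the constrained model wins because the small loss in log-likelihood (0.4) costs less than the penalty saved by dropping one parameter. This is the kind of trade-off an equality constraint between lineages has to pass.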
Rignault, G.; Merle, M.; Folly-Ramos, E.; Almeida, C. E.; Harry, M.; Filee, J.
Triatominae bugs are the main vectors of Chagas disease in Latin America and rely on microbial nutritional symbiosis to complement their haematophagous diet with B-vitamins. While Rhodococcus bacteria have been identified as key symbionts, diverse metabarcoding analyses have suggested additional candidates. However, symbiont genomic data and metabolic capabilities remain largely uncharacterized. To address this gap, we generated metagenomic assemblies for 14 Triatominae and captured 15 bacterial genomes belonging to 4 genera (Rhodococcus, Wolbachia, Symbiopectobacterium and Arsenophonus) across 9 Triatominae species. We identified five co-infection cases, including one involving two distinct Arsenophonus symbionts, one exhibiting hallmarks of massive genome degradation. Phylogenetic analyses revealed that Triatominae-associated symbionts form monophyletic groups within each genus, suggesting common origins followed by co-evolution with their hosts. Annotation of vitamin B metabolic genes indicates that most symbionts harbour incomplete pathways, with evidence of metabolic complementation between co-infecting symbionts. Additionally, we identified bacterial genes laterally transferred into host insect genomes, interpreted as footprints of present or past symbiotic associations. Nearly all Triatominae genomes displayed transferred genes from all four bacterial genera, including hosts with no detectable symbiont in genome assemblies. Taken together, these discoveries support the existence of a stable and limited network of four possible nutritional symbiont lineages with rare evidence of symbiont turn-overs.
Significance statement: Triatominae bugs, vectors of Chagas disease, are known to harbor a diverse community of nutritional bacterial symbionts whose genomic and metabolic roles have remained largely unexplored. By reconstructing 15 symbiont genomes that segregate as four bacterial genera, we provide important insight into the origins, the evolution and the metabolic structure of the nutritional symbiosis in Triatominae. These findings support a stable, evolutionarily conserved network of nutritional symbionts with limited turnover.
RAZAFIMAHATRATRA, S. L.; RASOLOHARIMANANA, L. T.; ANDRIAMARO, T. M.; RANAIVOMANANA, P.; SCHOENHALS, M.
Interpreting serological data remains challenging, particularly in low-prevalence or cross-reactive contexts, where antibody responses often show substantial overlap between exposed and unexposed individuals and may depart from normal distributional assumptions. Conventional cutoff-based approaches often yield inconsistent or biased estimates of seroprevalence. Here, we present a decisional framework based on finite mixture models (FMMs) that enhances the robustness and interpretability of serological analyses. Beyond simply applying mixture models, our framework integrates multiple methodological innovations: (i) systematic comparison of Gaussian and skew-normal mixture models to accommodate asymmetric antibody distributions; (ii) rigorous model selection using the Cramér-von Mises test (p > 0.01) combined with a parsimonious score (APS) to prioritize models with well-separated clusters; and (iii) hierarchical clustering of posterior probabilities to collapse latent components into biologically meaningful seronegative and seropositive groups. Applied to chikungunya virus (CHIKV) data from Bangladesh, the framework produced prevalence estimates consistent with ROC-based methods while probabilistically identifying borderline cases. Validation on SARS-CoV-2 and dengue datasets further demonstrated its generalizability: for SARS-CoV-2, the approach identified up to five latent clusters with high sensitivity (up to 100%) and specificity (up to 100%), enabling discrimination by disease severity. For dengue, it revealed interpretable subgrouping consistent with background exposure and subclinical infection, despite limited confirmed cases. By integrating distributional flexibility, robust goodness-of-fit testing, and biologically guided cluster consolidation, this decisional FMM framework provides a reproducible and scalable method for serological interpretation across pathogens and epidemiological settings, addressing key limitations of threshold-based classification.
Hekstra, D. R.; Wang, H. K.; Choe, A. K.
Perturbative X-ray crystallography can visualize functional dynamics and conformational changes in proteins at atomic resolution. During a typical perturbative crystallography experiment, only a fraction of protein molecules in a crystal will be perturbed, or "excited". As a result, the observed data represent a mixture of excited and ground states. The conventional approach to estimating the excited-state structure factor amplitudes is to linearly extrapolate the difference between the structure factor amplitudes of the perturbed and unperturbed data. This approach often fails to yield well-refined structural models because it amplifies experimental errors and neglects phase differences between the ground and excited states. Here, we introduce an approach to estimating excited-state structure factor amplitudes that starts from a statistical prior for the correlations between excited and ground states. Using benchmarks from time-resolved crystallography and a drug-fragment screen, we illustrate how this approach effectively addresses the limitations of traditional extrapolation.
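The conventional linear extrapolation mentioned above can be sketched as follows. This illustrates the traditional approach, not the method the abstract introduces. It assumes the commonly used form in which the amplitude difference is divided by the excited fraction, and it assumes amplitudes mix linearly, which ignores phase differences (one of the limitations the abstract names). The function name and all numbers are hypothetical.

```python
def extrapolate_amplitude(f_ground, f_perturbed, excited_fraction):
    """Linearly extrapolate an excited-state structure factor amplitude.

    Assumed form: F_excited = F_ground + (F_perturbed - F_ground) / p,
    where p is the fraction of molecules in the excited state.
    """
    if not 0.0 < excited_fraction <= 1.0:
        raise ValueError("excited_fraction must be in (0, 1]")
    return f_ground + (f_perturbed - f_ground) / excited_fraction


# Hypothetical amplitudes: ground state 100, true excited state 80, p = 0.1.
p = 0.1
f_ground = 100.0
f_excited_true = 80.0
# Mixture amplitude under the linear-mixing assumption (phases ignored).
f_perturbed = (1.0 - p) * f_ground + p * f_excited_true

# Noise-free data recovers the excited-state amplitude.
recovered = extrapolate_amplitude(f_ground, f_perturbed, p)

# A measurement error of 1 unit on the perturbed amplitude is scaled by 1/p.
noisy = extrapolate_amplitude(f_ground, f_perturbed + 1.0, p)
error_amplification = noisy - recovered
```

With an excited fraction of 0.1, a one-unit measurement error becomes roughly a ten-unit error in the extrapolated amplitude, which shows why small occupancies make this estimate fragile.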
Sy, M.; Ndiaye, T.; Thakur, R.; Gaye, A.; Levine, Z. C.; Ngom, B.; Bellavia, K. L.; Firer, D.; Toure, M.; Ndiaye, I. M.; Diedhiou, Y.; Mbaye, A. M.; Gomis, J. F.; DeRuff, K. C.; Deme, A. B.; Ndiaye, M.; Badiane, A. S.; Paye, M. F.; Sabeti, P. C.; Ndiaye, D.; Siddle, K. J.
Emerging infectious diseases and antimicrobial resistance (AMR) have surfaced as two major public health threats over the past two decades. Consequently, integrative surveillance systems capable of detecting both emerging pathogens and resistance-carrying bacteria are crucial. With advances in next-generation sequencing, simultaneous detection of pathogens and AMR is increasingly feasible. In this study, we used short-read metatranscriptomics complemented by total 16S rRNA metagenomic long-read sequencing to analyze paired oral and plasma samples from a cohort of febrile individuals at two locations in Senegal. Oral microbiomes differed in community composition between locations, and reduced diversity and richness were significantly associated with high fever. We identified at least one known pathogen in 15.33% (23/150) of samples, with Borrelia crocidurae as the most frequently detected pathogen. We detected both pathogenic and non-pathogenic viruses in oral (10/72) and plasma (9/78) samples. Finally, we observed a high frequency of genes associated with resistance and virulence: 10% of samples expressed at least one AMR gene (ARG), and 24% expressed virulence factor genes. Resistance to widely used beta-lactam antibiotics was the most prevalent. Our findings provide critical data on oral and plasma microbiomes in the context of acute febrile illness in Senegal while expanding understanding of circulating ARGs.